Discovering Related Data Sources in Data-Portals

Authors

  • Andreas Wagner
  • Peter Haase
  • Achim Rettinger
  • Holger Lamm
Abstract

To allow effective querying on the Web of data, systems frequently rely on data from multiple sources for answering queries. For instance, a user may wish to combine data from sources contained in different statistical catalogs. Given such federated queries, systems must allow user involvement during data source selection in order to enable an interactive exploration of results. That is, a user should be able to choose which data sources contribute to query results, and thereby refine or expand current findings. This requires effective recommendations of data sources to pick, a task known as data source contextualization. Recent work, however, solely targets source contextualization for “Web tables”, and relies heavily on schema information and simple table structures. Addressing these shortcomings, we exploit work from the field of data mining and show how to enable effective Web data source contextualization. Based on a real-world finance use case, we built a contextualization engine, which we integrated into a Web search system, our data portal, for accessing statistics data sets.

Similar resources

Increasing Quality of Austrian Open Data by Linking them to Linked Data Sources: Lessons Learned

One of the goals of the ADEQUATe project is to improve the quality of the (tabular) open data being published at two Austrian open data portals by lifting these tabular data to Linked Data, i.e., (1) classifying columns using Linked Data vocabularies, (2) linking cell values against Linked Data entities, and (3) discovering relations in the data by searching for evidence of such relations ...

Semantic Integration and Interoperability among Portals

Introduction: In distributed settings, such as that of the World Wide Web, where a large number of information sources and services reside, portals provide a single point of global access via a single and unified view. This view is circumscribed by a specific conceptualization and a specific vocabulary whose entries provide lexicalizations of the concepts used for shaping information, data, and ...

Toward Automated Large-Scale Information Integration and Discovery

The high cost of data consolidation is the key market inhibitor to the adoption of traditional information integration and data warehousing solutions. In this paper, we outline a next-generation integrated database management system that takes traditional information integration, content management, and data warehouse techniques to the next level: the system will be able to integrate a very lar...

Identifying the technical requirements for designing health portals

Aim: Considering technical requirements in the design of health portals increases the validity of information. This study identified the technical and content structure required to create these portals. Methods: This was a qualitative study which was conducted in 2020. A combination of comprehensive review and interview was used. The search was performed in Elsevier, EBSCO, Scopus, Web of Scie...

Finding Quality in Quantity: The Challenge of Discovering Valuable Sources for Integration

Data is becoming a commodity of tremendous value for many domains. This is leading to a rapid increase in the number of data sources and public access data services, such as cloud-based data markets and data portals, that facilitate the collection, publishing and trading of data. Data sources typically exhibit wide variety and heterogeneity in the types or schemas of the data they provide, thei...

Publication date: 2013